Enable mutex functionality in nxsem #16194

Open
wants to merge 5 commits into
base: master
from

Conversation

jlaitine
Contributor

Summary

This enables mutex functionality in the nxsem code, allowing atomic wait/post fast paths that need no syscall, even when priority inheritance is enabled for mutexes.

The basics of the implementations are as follows:

  • For mutexes, replace the semaphore count with a mutex holder word, which consists of 1 bit for mutex internal locking and 31 bits for the task PID
  • Instead of manipulating the count with atomic instructions, manipulate the mutex holder
  • Instead of allocating the holder structure and storing it in the task's list for priority inheritance every time, do it only when a task blocks on the mutex. This is possible because a mutex can have only one running holder.
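
To make the holder-word idea above concrete, here is a minimal sketch using C11 atomics rather than the NuttX atomic API. The `sketch_*` names and the `SKETCH_NO_MHOLDER` value are assumptions made for illustration; only the bit-31 "blocks" flag is taken from the macros discussed in this PR. The real nxsem code may be structured quite differently.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 flags that the mutex blocks some other task (as in the PR). */
#define SKETCH_MBLOCKS_BIT (((uint32_t)1) << 31)

/* ASSUMPTION: placeholder "no holder" value in the 31-bit PID field;
 * the real NXSEM_NO_MHOLDER value is not shown in this conversation.
 */
#define SKETCH_NO_MHOLDER  ((uint32_t)0x7ffffffe)

/* Hypothetical stand-in for the word that replaces the semaphore count. */
typedef struct
{
  _Atomic uint32_t mholder;
} sketch_mutex_t;

static void sketch_init(sketch_mutex_t *m)
{
  atomic_init(&m->mholder, SKETCH_NO_MHOLDER);
}

/* Fast path: lock and record the holder with a single compare-and-swap.
 * Returns false if the mutex is already held; a real implementation
 * would then fall back to the slow (syscall) path.
 */
static bool sketch_trylock(sketch_mutex_t *m, uint32_t tid)
{
  uint32_t expected = SKETCH_NO_MHOLDER;
  return atomic_compare_exchange_strong(&m->mholder, &expected, tid);
}

/* Fast path: release only if we are the holder and nobody is blocked.
 * With the blocks bit set, the stored word differs from tid, the CAS
 * fails, and the caller must take the slow path to wake a waiter.
 */
static bool sketch_unlock_fast(sketch_mutex_t *m, uint32_t tid)
{
  uint32_t expected = tid;
  return atomic_compare_exchange_strong(&m->mholder, &expected,
                                        SKETCH_NO_MHOLDER);
}

/* Slow-path helper: a waiter flags that the mutex now blocks a task. */
static void sketch_mark_blocks(sketch_mutex_t *m)
{
  atomic_fetch_or(&m->mholder, SKETCH_MBLOCKS_BIT);
}

/* Extract the holder PID by masking off the blocks bit. */
static uint32_t sketch_holder(sketch_mutex_t *m)
{
  return atomic_load(&m->mholder) & ~SKETCH_MBLOCKS_BIT;
}
```

In this sketch the uncontended lock and unlock each cost one atomic operation, which is the property that lets the common case avoid a syscall.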

Doing this allows cleaning up the nxmutex interface into a very thin wrapper on top of nxsem (e.g. removing the extra holder variable inside nxmutex; a plain nxmutex could be just a typedef of nxsem). This is left for future PRs.

This also improves performance significantly in SMP and in CONFIG_BUILD_KERNEL / CONFIG_BUILD_PROTECTED targets, since most of the time no syscall to the kernel is needed when a mutex is taken/posted.

I am putting this up as a draft for now and will still do some more testing on qemu targets.

Impact

This has no functional impact. As for performance, in my real application on RISC-V the CPU utilization improves as follows:

  • RISC-V 64-bit (MPFS SMP, 4 harts, CONFIG_BUILD_KERNEL): CPU usage: 36% -> 25%
  • RISC-V 64-bit (MPFS SMP, 4 harts, CONFIG_BUILD_FLAT): CPU usage: 12% -> 9%

Testing

Real hardware:

  • Custom HW, mpfs (risc-v 64-bit): non-SMP and SMP 4 harts, both with CONFIG_BUILD_FLAT and CONFIG_BUILD_KERNEL.
  • Custom HW, i.MX93: Single core CONFIG_BUILD_FLAT and CONFIG_BUILD_KERNEL

@github-actions github-actions bot added the Area: OS Components and Size: L labels Apr 11, 2025
@nuttxpr

nuttxpr commented Apr 11, 2025

[Experimental Bot, please feedback here]

Yes, this PR appears to meet the NuttX requirements, although it's marked as a draft and needs more testing. The summary clearly explains the "why," "what," and "how" of the changes. The impact section addresses key areas, including performance improvements and build configurations tested. The testing section provides information about the hardware and configurations used, though the logs are missing. The lack of "before" and "after" logs is the primary reason it doesn't fully meet the requirements yet. Adding those logs, completing testing on qemu, and removing the draft status will complete the PR.

@jlaitine
Contributor Author

This is an initial push, from which I forgot to squash one patch, so it is still broken when priority inheritance is disabled (all the testing I did was with priority inheritance enabled).

So please don't review yet, I will still fix it and add some cleanups.

This is just a FYI if someone wants to get a preliminary view ;)

@jlaitine jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch 2 times, most recently from e6dbdbe to 05ef6a8 Compare April 12, 2025 14:11
@jlaitine jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch from 05ef6a8 to a9e50bd Compare April 15, 2025 06:33
@jlaitine
Contributor Author

Re-tested this, and I believe it is now ready for review. I disagreed with changing the mholder to int32, so I didn't do that. It is obviously still open for discussion, but please see my points above first.

@jlaitine jlaitine marked this pull request as ready for review April 15, 2025 06:38
@jlaitine jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch from a9e50bd to 4b07113 Compare April 15, 2025 06:54
@jlaitine
Contributor Author

Seems that I still need to work on this; citest is failing for some reason

@lupyuen
Member

lupyuen commented Apr 15, 2025

@jlaitine I restarted the CI Test, let's see if it fails again...

@jlaitine jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch 2 times, most recently from 9829870 to 6deed38 Compare April 15, 2025 12:59
@jlaitine
Contributor Author

Fixed: bugs causing the citest to fail, review comments. Need to run a bunch of tests again.

@@ -86,7 +86,8 @@ void nxsem_recover(FAR struct tcb_s *tcb)
if (tcb->task_state == TSTATE_WAIT_SEM)
{
FAR sem_t *sem = tcb->waitobj;
-DEBUGASSERT(sem != NULL && atomic_read(NXSEM_COUNT(sem)) < 0);
+
+DEBUGASSERT(sem != NULL);

/* Restore the correct priority of all threads that hold references
Contributor

In the case of a mutex, only one task can hold a reference: the holder itself. Should the call to nxsem_canceled be omitted when dealing with mutexes?

Contributor Author

There may be several tasks blocked on a mutex, and one of them can be deleted while holding the mutex. Again, keeping the existing functionality...

Contributor

Yes, many can wait, but only 1 can hold

@jlaitine jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch 4 times, most recently from 1427db1 to c3216ab Compare April 25, 2025 08:07
@jlaitine
Contributor Author

@pussuw I have now modified nxsem_reset for mutexes somewhat according to your feedback, and re-tested that it works for me. Please re-check that part!

@jlaitine jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch from c3216ab to a50ff7a Compare April 25, 2025 13:24
… up flash space

The board has got only 64KB of flash, and is on the limit. Removing printf floating point
support frees up ~3kB of flash.

Signed-off-by: Jukka Laitinen <[email protected]>
…leted

The task which is deleted should be removed from the semaphore's waitlist,
if the task happens to be blocked.

Signed-off-by: Jukka Laitinen <[email protected]>
@jlaitine jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch from a50ff7a to c76e1f9 Compare April 25, 2025 13:27
This puts the mutex support fully inside nxsem, allowing
locking the mutex and setting the holder with single atomic
operation.

This enables fast mutex locking from userspace, avoiding taking
critical_sections, which may be heavy in SMP. It also enables cleanup
of the nxmutex library in the future.

Signed-off-by: Jukka Laitinen <[email protected]>
… inheritance is enabled

This enables the mutex fast path for nxsem_wait, nxsem_trywait and nxsem_post also when
the priority inheritance is enabled.

Signed-off-by: Jukka Laitinen <[email protected]>
- Remove the redundant holder, as nxsem now manages holder TID
- Remove DEBUGASSERTIONS which are managed in nxsem
- Remove the "reset" handling logic, as it is now managed in nxsem
- Inline the simplest functions

Signed-off-by: Jukka Laitinen <[email protected]>
@jlaitine jlaitine force-pushed the enable_mutex_functionality_in_nxsem branch from c76e1f9 to 810f582 Compare April 25, 2025 13:30

-#define NXSEM_COUNT(s) ((FAR atomic_t *)&(s)->semcount)
+#define NXSEM_COUNT(s) ((FAR atomic_t *)&(s)->val.semcount)
#define NXSEM_IS_MUTEX(s) (((s)->flags & SEM_TYPE_MUTEX) != 0)
Contributor

remove one space to align with other macros


/* Mutex related helper macros */

#define NXSEM_MBLOCKS_BIT (((uint32_t)1) << 31)
Contributor

Should we change all MBLOCKS to BLOCKED (e.g. NXSEM_BLOCKED_BIT)?

Contributor Author

Sure, if you like it better. The mutex is not blocked, it blocks a thread; that's why it is "blocks". But obviously you can think that the holder refers to the thread, so "blocked" is just as good.


/* Check if holder value (TID) is not NO_HOLDER or RESET */

#define NXSEM_MACQUIRED(h) (!(((h) & NXSEM_NO_MHOLDER) == NXSEM_NO_MHOLDER))
Contributor

#define NXSEM_MACQUIRED(h)   (((h) & NXSEM_NO_MHOLDER) != NXSEM_NO_MHOLDER)
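
The suggested form is a pure simplification: `!(x == y)` and `x != y` are the same predicate. A small sketch that checks this follows. The `SKETCH_NO_MHOLDER` value is an assumption chosen only so the macros can be evaluated, since the real `NXSEM_NO_MHOLDER` definition is not shown in this conversation.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: placeholder value, used only to exercise the macros. */
#define SKETCH_NO_MHOLDER ((uint32_t)0x7ffffffe)

/* Form as written in the PR */
#define MACQUIRED_ORIG(h) \
  (!(((h) & SKETCH_NO_MHOLDER) == SKETCH_NO_MHOLDER))

/* Form suggested in the review */
#define MACQUIRED_SUGG(h) \
  (((h) & SKETCH_NO_MHOLDER) != SKETCH_NO_MHOLDER)

/* Returns true when both forms give the same answer for holder word h. */
static bool forms_agree(uint32_t h)
{
  return (bool)MACQUIRED_ORIG(h) == (bool)MACQUIRED_SUGG(h);
}
```

Because the mask test compares against all bits of the "no holder" pattern, any value that has every one of those bits set is treated as not acquired, which is how one macro can cover both the NO_HOLDER and RESET cases mentioned in the comment.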


/* Check if mutex is acquired and blocks some other task */

#define NXSEM_MBLOCKS(h) (((h) & NXSEM_MBLOCKS_BIT) != 0)
Contributor

NXSEM_IS_MBLOCKED?

Contributor Author

I don't like... semaphore is not blocked, it blocks

Contributor

From my understanding, this flag indicates that other threads are blocked (or waiting) on the mutex.

-DEBUGASSERT(sem != NULL && atomic_read(NXSEM_COUNT(sem)) < 0);
+DEBUGASSERT(sem != NULL &&
+(mutex || atomic_read(NXSEM_COUNT(sem)) < 0) &&
+(!mutex ||
Contributor

mutex && NXSEM_MBLOCKS(atomic_read(NXSEM_MHOLDER(sem)))
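
For readers parsing the assertion shape in the diff, `(mutex || A) && (!mutex || B)` is a case split: it checks `B` for mutexes and `A` for counting semaphores. The sketch below verifies that identity exhaustively. Here `count_negative` stands for the `atomic_read(NXSEM_COUNT(sem)) < 0` test, while `mutex_cond` is a hypothetical placeholder for the mutex-side condition, which is truncated in the diff above.

```c
#include <stdbool.h>

/* Shape used in the diff: each clause is vacuously true for one case. */
static bool check_guarded(bool mutex, bool count_negative, bool mutex_cond)
{
  return (mutex || count_negative) && (!mutex || mutex_cond);
}

/* The same predicate written as an explicit case split. */
static bool check_split(bool mutex, bool count_negative, bool mutex_cond)
{
  return mutex ? mutex_cond : count_negative;
}

/* Compare both forms over all 8 combinations of the three inputs. */
static bool forms_equivalent(void)
{
  for (int i = 0; i < 8; i++)
    {
      bool m = (i & 1) != 0;
      bool a = (i & 2) != 0;
      bool b = (i & 4) != 0;

      if (check_guarded(m, a, b) != check_split(m, a, b))
        {
          return false;
        }
    }

  return true;
}
```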

@@ -35,30 +35,14 @@
* Pre-processor Definitions
****************************************************************************/

-#define NXMUTEX_RESET ((pid_t)-2)

/****************************************************************************
* Private Functions
Contributor

remove

@@ -120,7 +98,6 @@ int nxmutex_init(FAR mutex_t *mutex)
return ret;
}

-mutex->holder = NXMUTEX_NO_HOLDER;
Contributor

why not inline nxmutex_init too

@@ -177,7 +122,7 @@ int nxmutex_destroy(FAR mutex_t *mutex)

bool nxmutex_is_hold(FAR mutex_t *mutex)
Contributor

why not inline nxmutex_is_hold

@@ -195,9 +140,11 @@ bool nxmutex_is_hold(FAR mutex_t *mutex)
*
****************************************************************************/

-int nxmutex_get_holder(FAR mutex_t *mutex)
+pid_t nxmutex_get_holder(FAR mutex_t *mutex)
Contributor

why not inline all functions in this file?

Contributor Author

Could do, but if an inlined function generates more code than a branch to the function, it increases code size, especially if it is frequently used. I only inlined functions for which I was sure that the code size doesn't increase.

Contributor

But some functions are very simple, yet not inlined.

@jlaitine
Contributor Author

@xiaoxiang781216 I disagree on your request for removing the const in many places. what is the reason why you want that? it just tells both the compiler and reader that the value is not going to change in the scope of the variable.

@xiaoxiang781216
Contributor

@xiaoxiang781216 I disagree on your request for removing the const in many places. what is the reason why you want that? it just tells both the compiler and reader that the value is not going to change in the scope of the variable.

Because the current code base never adds const to local integer variables, even if they don't change after initialization.

Labels
Area: OS Components, Board: arm, Size: XL
6 participants